FOREWORD


A primary objective of research task 2211H1 of the U.S. Army Research Institute for the
Behavioral  and  Social  Sciences  (ARI)  is  to  provide  enhancements  to  selection  through 
development  and  refinement  of  new  measures. 

The  focus  of  this  research  is  to  develop  biodata  indicators  of  attrition  from  training  and 
leadership  potential  and  performance  that  will  measure  relevant  temperament  constructs  and 
be suitable for use in an admissions package at the U.S. Military Academy (USMA). The
results of this phase of the research indicate that biodata scales can be used to provide
indexes of attrition from training and leadership performance during a cadet's first 6 months
at USMA. In addition, the biodata measures demonstrate properties in cadets that make
them more suitable for admissions than their temperament counterparts. Moreover, the
temperament and biodata measures add incremental validity over and above that of measures
currently used for admissions to USMA.

This  research  is  the  result  of  a  collaborative  effort  between  the  Office  of  Institutional 
Research  (OIR)  at  USMA  and  ARI  initiated  in  November  1989.  The  commander  and 
researchers  at  OIR  have  been  apprised  of  research  results  on  a  continuous  basis.  Follow-up 
research  will  include  cross-validation  of  results  and  additional  measures  of  performance  from 
subsequent  stages  of  the  cadets’  tenure  at  USMA  and  in  the  officer  corps. 


EDGAR  M.  JOHNSON 
Technical  Director 
CAPTURING TEMPERAMENT CONSTRUCTS WITH OBJECTIVE BIODATA


EXECUTIVE  SUMMARY 


Requirement: 

The  purpose  of  this  research  is  to  develop  biodata  indicators  of  attrition  from  training 
and  leadership  potential  and  performance  that  will  measure  relevant  temperament 
constructs,  yet  still  be  potentially  suitable  for  use  in  an  admissions  package  at  the  U.S. 
Military Academy (USMA).


Procedure: 

The  Army  temperament  measure  Assessment  of  Background  and  Life  Experience 
(ABLE)  and  a  73-item  biodata  instrument  developed  for  this  research  were  administered  to 
1,325 members of the USMA Class of 1994. Criterion measures were attrition from the 6-week
preliminary summer training period, leadership ratings from that summer period, and
leadership  ratings  from  the  fall  semester.  The  biodata  items  were  coded  in  order  to  produce 
analogs  to  the  five  ABLE  scales  in  the  research.  The  relationship  of  each  ABLE  scale  and  its 
biodata  analog  to  each  of  the  three  criteria,  as  well  as  the  incremental  contributions  of  the 
total  ABLE  and  its  biodata  analog  over  and  above  that  of  the  currently  used  USMA  Whole 
Candidate  Score  (WCS),  were  evaluated.  In  addition,  the  relative  contribution  of  each 
component  ABLE  scale  as  an  indicator  for  each  criterion  was  assessed.  Finally,  the 
susceptibility  of  both  the  ABLE  and  biodata  scales  to  socially  desirable  responding  was 
investigated. 
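
The idea of incremental validity can be illustrated with a short sketch. It computes the gain in squared multiple correlation when a second predictor (for example, a new scale) is added to a first (for example, an existing composite), using the standard two-predictor formula. This is a minimal illustration under stated assumptions, not the analysis carried out in this research; the function name and all correlation values are hypothetical.

```python
def incremental_r2(r_y1: float, r_y2: float, r_12: float) -> float:
    """Gain in R^2 from adding predictor 2 to a model that already has predictor 1.

    r_y1: correlation of the criterion with predictor 1 (e.g., an existing composite)
    r_y2: correlation of the criterion with predictor 2 (e.g., a new scale)
    r_12: correlation between the two predictors
    """
    if abs(r_12) >= 1.0:
        raise ValueError("predictors must not be perfectly correlated")
    r2_first = r_y1 ** 2
    # Standard two-predictor formula for the squared multiple correlation.
    r2_both = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return r2_both - r2_first


# Hypothetical correlations, chosen only for illustration.
gain = incremental_r2(r_y1=0.30, r_y2=0.25, r_12=0.20)
```

When the two predictors are uncorrelated, the gain reduces to the squared validity of the added predictor; as their overlap grows, the gain shrinks.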


Findings: 

The  biodata  scales  showed  strong  relationships  to  their  equivalent  ABLE  scales  and 
smaller relationships to the other ABLE scales. When compared with the ABLE scales
regarding  their  relationship  to  the  criteria,  the  biodata  measures  demonstrated  comparable 
validities  in  13  of  15  cases.  Further,  for  each  criterion,  either  overall  ABLE  or  the  biodata 
equivalent  added  incremental  validity  over  and  above  the  WCS.  Four  of  the  five  individual 
biodata  scales,  as  well  as  the  overall  biodata  scale,  had  significantly  smaller  correlations  with  a 
social  desirability  scale  than  the  equivalent  ABLE  scale. 




Utilization  of  Findings: 


The  results  of  this  research  can  be  used  to  develop  an  indicator  of  attrition  and 
leadership  potential  that  will  enhance  the  USMA  admissions  package.  The  research  also 
refined  the  methodology  for  developing  biodata  analogs  to  temperament  measures.  This 
methodology  will  prove  useful  in  ongoing  investigations  of  the  feasibility  for  using  these 
measures  in  officer  and  enlisted  selection. 


INTRODUCTION 

In  recent  years,  there  has  been  a  good  deal  of  interest  in  the  use  of  biodata  for 
military selection, highlighted by a number of current efforts in the joint and individual
service arenas (Trent, Quenette, & Pass, 1989; Watson, 1989). Reviews of selection
measures have found biodata validity coefficients to be impressive compared with other
measures (Asher & Sciarrino, 1974; Ghiselli, 1966; Reilly & Chao, 1982). Recent
research indicates that biodata validities may be more stable over time and more
generalizable across organizations than previously thought (Rothstein, Schmidt, Erwin,
Owens, & Sparks, 1990). Nevertheless, certain concerns involving the use of biodata in
applied settings remain. This paper describes a research effort focused on dealing with
these concerns, conducted by members of the U.S. Army Research Institute for the
Behavioral and Social Sciences in conjunction with the Office of Institutional Research at the
U.S.  Military  Academy  at  West  Point. 

Research Problem: The Keying Dilemma

Currently,  two  methods  are  most  commonly  used  for  keying  biodata,  that  is, 
determining the numerical value (weight) to be assigned to each response alternative
within an item (Mumford & Owens, 1987). The first approach, empirical keying, was the
sole method used in early biodata research and continues to be used by many
practitioners. With empirical keying, weights are assigned to each alternative based on
its mean score on the criterion being used. For example, if the criterion is leader
ratings, the alternative within an item whose respondents have the highest average rating
is assigned the highest score. The same is done for each alternative, so that the continuum
of values within the item is arranged to reflect scores on the criterion. Purely empirical
keying is highly sensitive to sample characteristics and can thereby lead to an optimal
correlation with the criterion. However, when the key is cross-validated, the regression
coefficient often is much smaller than that of the initial derivation sample, a phenomenon
referred to as shrinkage. Moreover, the method has been termed "dustbowl empiricism" by
critics for being atheoretical and failing to advance understanding of the underlying
antecedents of successful performance (Dunnette, 1962; Pace & Schoenfeldt, 1977).
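
The empirical keying procedure described above can be sketched in a few lines. The example is a minimal illustration, assuming a single item and a small derivation sample; the function names, the response alternatives, and the ratings are all hypothetical and are not taken from this research.

```python
from collections import defaultdict
from statistics import mean


def empirical_key(responses, criterion):
    """Weight each response alternative by the mean criterion score of those choosing it."""
    by_alt = defaultdict(list)
    for alt, score in zip(responses, criterion):
        by_alt[alt].append(score)
    return {alt: mean(scores) for alt, scores in by_alt.items()}


def score_item(responses, key, default=0.0):
    """Apply a key; alternatives unseen in the derivation sample get a default weight."""
    return [key.get(alt, default) for alt in responses]


# Hypothetical derivation sample: one item with alternatives "a" to "c",
# and a leadership rating for each respondent.
responses = ["a", "b", "b", "c", "a", "c"]
ratings = [2.0, 3.0, 4.0, 5.0, 3.0, 4.0]

key = empirical_key(responses, ratings)
new_scores = score_item(["c", "a", "d"], key)
```

Because the weights are fitted to one sample, the keyed scores should be correlated with the criterion again in a separate holdout sample; the drop in that correlation is the shrinkage discussed above.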

Some  researchers  have  instead  championed  a  rational  approach  to  biodata,  in 
which  item  alternatives  are  assigned  a  priori  values  based  on  a  presumed  relationship  of 
the  item  to  a  specific,  unitary  construct  (Mitchell  &  Klimoski,  1982).  Thus,  the  rational 
approach  is  usually  an  attempt  to  measure  temperament  or  other  constructs  with 
biodata-like items. Adherence to this strategy leads to a preference for items that can be
clearly  related  to  only  a  single  construct  and  then  combined  into  homogeneous  scales. 
Advocates  of  rational  biodata  development  claim  that  their  method,  which  uses 
predetermined  values  for  item  responses,  will  reduce  shrinkage  because  it  is  not  fitted  to 
sample-specific idiosyncrasies. A possible problem with this approach, however, is that
responses to complex, heterogeneous behaviors would also have to be coded in terms of
single constructs, even if they were really a function of multiple influences. Also, by
making a priori decisions about item directionality across any or all criteria, the
possibility  that  a  certain  behavior  will  be  beneficial  for  some  outcomes  and 
counterproductive  for  others  is  generally  downplayed. 
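
By way of contrast with empirical keying, rational keying can be sketched as a fixed lookup table. The example is a minimal illustration; the item names, alternatives, and weights are hypothetical and stand in for a priori values that a researcher would assign from a presumed link to a single construct.

```python
# Hypothetical a priori key: each alternative's weight is fixed in advance
# from a presumed link to one construct, not estimated from criterion data.
RATIONAL_KEY = {
    "item_1": {"never": 0, "once": 1, "more than once": 2},
    "item_2": {"no": 0, "yes": 1},
}


def rational_scale_score(answers, key=RATIONAL_KEY):
    """Sum the predetermined weights over the items of one homogeneous scale."""
    total = 0
    for item, alternative in answers.items():
        total += key[item][alternative]
    return total


score = rational_scale_score({"item_1": "once", "item_2": "yes"})
```

Since no weight depends on the sample at hand, the key itself cannot capitalize on sample-specific idiosyncrasies. The cost is that each item's direction is fixed for every criterion.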

In recent years, another issue has surfaced. Many researchers have expressed
concern about the possibility of socially desirable responding and faking on self-report
measures, notably temperament measures (Crowne & Marlowe, 1960; Hough, Eaton,
Dunnette, Kamp, & McCoy, 1990; Paulhus, 1984). While the same concern has been
expressed about biodata, one proposed solution has been to limit biodata to objective
and verifiable items. This presents a problem for those advocating rational keying, in
that objective and verifiable actions tend to be heterogeneous (determined by multiple
causes) and therefore difficult to attribute to a single temperament. Conversely,
adherence to this strategy would eliminate the use of subjective, homogeneous items.
Because  researchers  using  the  rational  approach  do  not  limit  themselves  to  objective, 
historical  and/or  verifiable  items,  their  measures  are  often  indistinguishable  from 
temperament  scales,  and  may  be  more  fakable  than  empirically  keyed  biodata. 

In  this  research  effort,  an  attempt  was  made  to  gain  the  conceptual  benefits  of 
rational  methods,  while  gaining  the  less  fakable  properties  associated  with  objective, 
verifiable  biodata.  Specifically,  an  attempt  was  made  to  key  verifiable  biodata  directly  to 
temperament  scales,  and  then  use  those  scales  rationally  with  multiple  criteria.  Though 
no  attempt  was  made  to  assign  items  exclusively  to  a  single  construct,  the  goal  was  to 
determine  if  biodata  scales  could  be  utilized  to  parallel  individual  temperament  scales. 

A secondary concern, whether this method constitutes an optimal use of objective biodata
items compared with some form of direct empirical keying, will be addressed in subsequent
research.

The  Research  Program  at  USMA 

Leadership  research  has  been  an  abiding  interest  of  the  U.S.  Military  Academy 
(USMA)  at  West  Point  for  many  years  (Page,  1934).  A  current  example  of  this 
emphasis  is  the  Leadership  Development  Project,  an  ongoing  research  effort,  approved 
in  March,  1988,  and  directed  by  the  USMA  Office  of  Institutional  Research  (OIR).  The 
stated  goals  of  the  project  are  to  a)  improve  measurement  of  candidate  leader  potential, 
b)  improve  measurement  of  cadet  leadership  performance,  and  c)  measure  contributions 
of USMA graduates to the common defense. In late 1989, OIR and the U.S. Army
Research  Institute  (ARI)  decided  to  jointly  pursue  their  mutual  interest  in  working  on 
the  first  of  these  goals. 

USMA  uses  a  three-pronged  approach  to  selection,  attempting  to  find  candidates 
who  will  excel  academically,  physically,  and  militarily.  As  officer  military  excellence  is 
largely defined as the ability to lead others, measurement of leadership potential and
performance  is  a  priority.  OIR/USMA  felt  that  although  candidate  academic  and 
physical  capabilities  were  adequately  measured,  improved  measurement  of  leadership 
potential  was  possible.  In  addition,  while  SAT  scores  have  had  a  strong,  demonstrated 
relationship  to  academic  attrition,  interest  was  expressed  in  finding  better  indicators  of 
nonacademic  attrition. 

Current  USMA  Admissions  Procedures 

Currently,  an  important  indicator  of  candidate  potential  used  in  the  admissions 
decision  at  West  Point  is  based  60%  on  an  applicant’s  standardized  test  scores  (i.e.,  SAT, 
ACT)  and  graduating  rank  in  high  school;  30%  on  the  Leadership  Potential  Score  (LPS), 
derived  from  the  School  Official  Evaluation  (an  evaluation  form  filled  out  by  high  school 
instructors),  and  the  Candidate  Activities  Record  (CAR),  a  checklist  of  extracurricular 
activities  and  varsity  sports;  and  10%  on  scores  on  the  Physical  Aptitude  Examination 
(PAE).  This  information  is  combined  in  a  weighted  composite  known  as  the  Whole 
Candidate  Score  (WCS).  The  information  on  the  CAR  is  similar  to  that  on  a  biodata 
instrument.  However,  scoring  keys  were  based  on  content  validity  judgments  of  USMA 
personnel rather than on criterion-related validity judgments. This raised the possibility
that  an  alternative  approach  might  yield  higher  relationships  with  relevant  outcome 
measures. 
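
The weighting described above can be written out as a simple composite. The sketch below uses the 60/30/10 weights stated in the text; everything else is an assumption made for illustration, including the function name, the component labels, and the premise that the three component scores have already been placed on a common scale. The text does not specify how USMA scales the components.

```python
# Weights stated in the text: 60% academic, 30% leadership potential, 10% physical aptitude.
WCS_WEIGHTS = {"academic": 0.60, "leadership": 0.30, "physical": 0.10}


def whole_candidate_score(components, weights=WCS_WEIGHTS):
    """Weighted composite of component scores assumed to share a common scale."""
    missing = set(weights) - set(components)
    if missing:
        raise KeyError(f"missing components: {sorted(missing)}")
    return sum(weights[name] * components[name] for name in weights)


# Hypothetical candidate with component scores on an assumed 0-100 scale.
wcs = whole_candidate_score({"academic": 80.0, "leadership": 70.0, "physical": 90.0})
```

With these hypothetical inputs the composite is dominated by the academic component, which carries six times the weight of the physical component.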

In  conjunction  with  the  Leadership  Development  Project,  OIR  explored 
approaches  toward  improving  USMA’s  selection  procedures  with  two  goals  in  mind.  The 
first  was  to  attempt  to  capture  motivational  indicators  of  leadership  performance,  with 
the  possible  goal  of  including  these  measures  in  future  admissions  packages.  The  second 
was  to  explore  empirical  methods  of  scoring  new  and  existing  inventories  of  previous 
behaviors  and  experiences.  Accordingly,  OIR/USMA  suggested  that  ARI  administer  a 
biodata questionnaire, as well as ABLE, an Army temperament measure, in order to
accomplish  these  goals.  The  next  section  describes  ABLE,  the  Army’s  temperament  test, 
and  its  role  in  the  current  project.  The  following  section  defines  what  a  biodata  measure 
is,  especially  in  contradistinction  to  a  temperament  measure,  and  describes  the  guidelines 
considered  in  the  development  of  the  USMA  biodata  measure.  Finally,  the  specific 
approach  and  goals  of  this  research  are  outlined. 


The Army’s ABLE

The Assessment of Background and Life Experiences (ABLE) is a temperament
measure developed and validated by the U.S. Army Research Institute as part of a long-term
research program called Project A, which was designed to revalidate the Armed
Services Vocational Aptitude Battery (ASVAB) and design supplementary tests measuring
additional constructs. ABLE was included in Project A to capture the motivational
element of performance ("will do"), as opposed to the ability ("can do") element. ABLE
is also under consideration in the Joint Services arena as part of a measure of
adaptability  to  military  life. 

Scale  development  involved  reviewing  12  major  personality  inventories  and  then 
reducing  the  number  of  temperament  dimensions  by  eliminating  redundancy  and  focusing 
on  predictors  of  job  performance  (Hough  et  al.,  1990).  In  its  complete  form,  ABLE 
consists of 10 scales measuring 5 constructs (see Table 1). In addition, validity scales,
which indicate whether or not the respondent's answers reflect faking or social desirability
distortion, were included. Nearly 50,000 soldiers in 21 Military Occupational Specialties
(MOS) were tested and their ABLE scores were used to predict NCO leadership
potential,  disciplinary  problems,  and  attrition. 


Table 1

Temperament Scales by Construct in the Assessment of Background and Life
Experience (ABLE)

Construct                   Scale

Stress Tolerance            Emotional Stability

Dependability               Nondelinquency
                            Traditional Values
                            Conscientiousness

Achievement/Leadership      Work Orientation
                            Self-Esteem
                            Dominance
                            Energy Level

Physical Condition          Physical Condition

Locus of Control            Internal Control

Agreeableness/Likability    Cooperativeness

Response Validity Scales    Non-Random Response
                            Social Desirability

Previous  findings  indicate  that  ABLE  predicts  enlisted  attrition,  effort  and 
leadership, and personal discipline (Hough et al., 1990). In Project A research, lower
scores on ABLE were significantly related to greater rates of attrition, with the
relationship most pronounced among those scoring low on ABLE (White, Nord, & Mael,
1990). ABLE was also related to probability of graduation of USMA graduates and
other trainees at the Ranger school course. In addition, the Achievement construct of
ABLE was found to be a significant predictor of effort and leadership (for the Work
Orientation scale, uncorrected r = .23). Other scales which significantly predicted effort
and leadership included Dominance, Energy Level, and Emotional Stability. Finally, the
ABLE Dependability construct significantly predicted discipline problems among enlisted
soldiers, with (uncorrected) validities ranging from .23 to .29 for the three Dependability
scales. ABLE was thus seen as an attractive measure of adaptability because it
specifically addressed dominance and leadership proclivities, and because of its
documented relationship to prediction of attrition, indiscipline, and leader potential
among NCOs.

However,  ABLE  has  potential  drawbacks  for  use  in  an  enlistment  or  admissions 
package.  One  is  the  fear  of  extensive  faking  and  socially  desirable  responses.  ABLE  is  a 
relatively  transparent  test,  with  no  attempt  to  obscure  desirable  responses,  and  with 
virtually  all  items  arranged  in  a  linear  continuum  of  desirability.  In  a  previous 
administration with enlisted soldiers, faking did not contaminate ABLE's validity
(Hough,  et  al.,  1990).  However,  the  fear  of  faking  would  be  increased  in  an  admissions 
situation,  where  the  instrument  is  often  taken  at  home  under  the  tutelage  of  parents  and 
other  advisors.  A  second  concern  was  that  some  ABLE  items  concerned  somewhat 
intrusive  and  "psychological"  topics,  such  as  physical  symptoms,  fears,  anxieties,  and 
feelings  of  depression  and  failure.  USMA  researchers  felt  that  these  types  of  items  could 
be  resented,  thus  driving  away  capable  candidates. 

Therefore,  the  researchers  sought  to  determine  if  ABLE  constructs  could  be 
measured  with  more  palatable  biodata  items.  Given  that  a  biodata  measure  was  sought 
specifically  because  of  the  qualities  distinguishing  it  from  temperament  measures,  it 
became  crucial  to  define  the  unique  characteristics  of  biodata,  as  well  as  how  they  differ 
from  temperament  measures.  The  guidelines  which  emerged  from  this  effort  are 
described  next. 

What  Makes  Biodata  Biodata? 

There  is  considerable  controversy  regarding  the  criteria  for  specifying  the 
domain and attributes of biodata items (Asher, 1972; Gandy, Outerbridge, Sharf, & Dye,
1989; Henry, 1965; Stricker, 1987). In addition, while some have attempted to
differentiate  biodata  items  from  temperament,  attitude,  or  interest  items  (Guthrie,  1944; 
Mumford  &  Owens,  1987),  in  practice,  many  items  termed  "biodata"  are  indistinguishable 
from  self-report  temperament  items  (Crosby,  1990).  It  is  not  uncommon  to  find  items 
about  internal  states,  opinions,  and  reactions  to  hypothetical  situations  included  in 
biodata  measures.  The  result  has  been  a  continued  blurring  of  what  constitutes  biodata. 
The confusion is especially problematic in light of claims that biodata scales are
more  resistant  to  social  desirability  distortion  (Telenson  et  al.,  1983)  and  generally 
achieve  higher  validities  (Asher,  1972;  Reilly  &  Chao,  1982)  than  temperament 
measures.  However,  this  may  be  true  only  of  certain  types  of  biodata,  such  as  verifiable 
items.  It  is  therefore  worthwhile  to  enumerate  the  attributes  that  have  been  used  to 
define  biodata  and  differentiate  it  from  other  self-report  measures. 

Defining  Biodata 

Biodata  items  attempt  to  measure  previous  and  current  life  events  which  have 
shaped  the  behavioral  patterns,  dispositions,  and  values  of  the  person.  Owens  has  stated 
that "one of our most basic measurement axioms holds that the best predictor of what a
man will do in the future is what he has done in the past" (1976, p. 625). It is presumed
that  a  person’s  outlook  is  affected  by  life  experiences  and  that  each  experience  has  the 
potential  to  make  subsequent  life  choices  more  or  less  desirable,  palatable,  or  feasible. 
One  possible  reason  is  that  the  focal  experience  reinforces  a  pattern  of  behavior. 
Alternatively, the focal experience may be partly or wholly determined by earlier causal
determinants (genetic, dispositional, or learned) that account for variations in both
earlier and current behavior.

Moreover,  every  experience  or  series  of  experiences  which  conceivably  categorizes 
(or  stigmatizes)  a  person  has  the  potential  to  shape  that  person's  behavioral  patterns, 
though  each  component’s  influence  is  mitigated  by  the  effects  of  all  other  identifications. 
Thus,  when  a  person  associates  with  a  team,  club,  school,  or  any  other  "psychological 
group,"  the  person  takes  on  (to  varying  degrees)  the  aspirations,  preferences,  values,  and 
self-perceptions  which  are  endemic  to  group  members.  Even  negative  categorizations 
(e.g.,  the  inability  to  swim,  ride  a  bike,  or  drive  a  car  at  the  same  age  as  classmates),  or 
so-called "input variables" (Owens & Schoenfeldt, 1979), such as place of upbringing, size
of high school, and parental occupation, could place the person in a self-perceived
category with a specific profile.

From  a  biodata  perspective,  therefore,  previous  events  and  experiences  are  not 
only indications of underlying dispositions, but are themselves seen as shapers of
subsequent  behavior.  By  contrast,  temperament  measures  primarily  attempt  to  capture 
somewhat  stable  dispositional  tendencies.  Thus,  the  typical  temperament  item  asks  the 
respondent  direct  questions  about  dispositions.  Alternatively,  a  temperament  item  may 
infer  the  construct  from  tendencies  evident  in  narrowly  focused  reactions  to  past  and 
current  events,  or  from  expressed  responses  to  hypothetical  and  future  situations.  When 
reactions  are  sampled,  they  are  seen  merely  as  outcomes  of  the  pre-existing 
temperament. 

In  summary,  the  realm  of  biodata  is  more  inclusive  than  temperament  in  terms  of 
content,  in  that  it  includes  behavioral  antecedents  and  indicators  of  skills,  abilities,  and 
temperaments  (Mumford  &  Stokes,  1991).  Conversely,  because  biodata  items  attempt  to 
measure only events and behaviors that have definitely occurred, many researchers have
argued that biodata items should be more restrictive in their attributes than temperament
ones. The attributes fall into two categories: those that aim at increasing the accuracy of
the information generated as biodata, and those that seek to limit the domain of biodata
content on legal or ethical grounds. These attributes are drawn partly from an earlier
typology by Asher (1972), and include new categories mentioned previously by others
(Barge, 1987; Stricker, 1987). They have been reviewed in depth by Mael (in press), and
are summarized below. Examples of each attribute appear in Table 2.

Biodata  Item  Attributes 

Historical  versus  hypothetical.  Biodata  items  should  pertain  solely  to  historical 
events,  events  which  have  taken  place,  or  continue  to  take  place.  This  would  exclude 
items  about  behavioral  intentions  or  about  presumed  behavior  in  a  hypothetical  situation. 
This  appears  to  be  the  core  attribute  of  biodata  items. 

External  versus  Internal.  Some  have  argued  that  biodata  items  should  deal  with 
external,  though  not  necessarily  publicly  seen,  actions.  This  would  exclude  items  about 
thoughts,  attitudes,  opinions,  and  unexpressed  reactions  to  events.  Items  about  what  one 
typically  does  in  situations  could  be  considered  historical  and  external.  While  the 
external attribute, as well as the objective and verifiable attributes mentioned below, have
been ignored by a number of researchers (e.g., England, 1971; Glennon, Albright, &
Owens,  1966;  Russell,  Mattson,  Devlin  &  Atwater,  1990),  each  may  be  crucial  to  claims 
of  greater  freedom  from  distortion  for  biodata  compared  to  temperament  scales. 

Objective and First-hand versus Subjective. Some who endorse the external
attribute  also  feel  that  biodata  should  be  objective  recollections,  requiring  only  the 
faculty  of  recall.  Subjective  interpretation  of  events,  such  as  assessing  if  one  was 
disappointed,  angry,  or  depressed  in  a  given  situation,  would  not  fit  this  attribute. 
Evaluation  of  one’s  qualities  or  performance  relative  to  that  of  others  also  would  be 
considered subjective. A corollary would be that biodata items ask only for the first-hand
knowledge of the respondent, as opposed to estimation of how others (peers,
parents,  teachers)  would  evaluate  one’s  performance  or  temperament,  which  involves  an 
additional  level  of  speculative  subjectivity. 

Discrete  versus  Summary  Actions.  Methodologically,  it  may  be  preferable  to 
focus  on  discrete  actions,  dealing  with  a  single,  unique  behavior  (e.g.,  age  when  received 
driver’s license), as opposed to summary responses (e.g., average time spent studying).
Responses to summary items also require computation or estimation and increase the
chance  of  inaccuracy.  However,  with  a  regularly  performed  behavior,  summary  recall 
could  be  more  realistic  and  accurate  than  recall  of  a  single,  arbitrarily  chosen  instance. 
Table 2

A Taxonomy of Biodata Items

Historical: How old were you when you got your first paying job?
Future or hypothetical: What position do you think you will be holding in ten years? What would you do if another person screamed at you in public?

External: Did you ever get fired from a job?
Internal: What is your attitude toward friends who smoke marijuana?

Objective: How many hours did you study for your real-estate license test?
Subjective: Would you describe yourself as shy? How adventurous are you compared to your co-workers?

First-hand: How punctual are you about coming to work?
Second-hand: How would your teachers describe your punctuality?

Discrete: At what age did you get your driver’s license?
Summary: How many hours do you study during an average week?

Verifiable: What was your grade point average in college? Were you ever suspended from your Little League team?
Non-verifiable: How many servings of fresh vegetables do you eat every day?

Controllable: How many tries did it take you to pass the CPA exam?
Non-controllable: How many brothers and sisters do you have?

Equal access: Were you ever class president?
Non-equal access: Were you captain of the football team?

Job relevant: How many units of cereal did you sell during the last calendar year?
Not job relevant: Are you proficient at crossword puzzles?

Non-invasive: Were you on the tennis team in college?
Invasive: How many young children do you have at home?

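
The taxonomy in Table 2 lends itself to a simple item-screening sketch. In the example below, each item is tagged only with the attribute it illustrates in Table 2, and items carrying any tag in an exclusion set are dropped. The class name, the function name, and the choice of which attributes to exclude are assumptions made for illustration; they do not represent the screening rules used in this research.

```python
from dataclasses import dataclass

# Assumed exclusion set: attributes treated here as departures from
# objective, verifiable biodata. The choice is illustrative only.
EXCLUDED = frozenset(
    {"hypothetical", "internal", "subjective", "second-hand", "non-verifiable"}
)


@dataclass(frozen=True)
class Item:
    text: str
    tags: frozenset


def screen(items, excluded=EXCLUDED):
    """Keep only items that carry none of the excluded attribute tags."""
    return [item for item in items if not (item.tags & excluded)]


# Example items from Table 2, each tagged with the attribute it illustrates.
pool = [
    Item("At what age did you get your driver's license?", frozenset({"discrete"})),
    Item("Would you describe yourself as shy?", frozenset({"subjective"})),
    Item("How would your teachers describe your punctuality?", frozenset({"second-hand"})),
    Item("What was your grade point average in college?", frozenset({"verifiable"})),
]

kept = screen(pool)
```

A fuller treatment would tag every item on every attribute, since a single item can be, for instance, both historical and subjective.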
Verifiable. A verifiable item is an item that can be corroborated from an
independent source. Item verifiability thus goes beyond both the external event and
objective criteria. The optimal source of verification is archival data, such as school
transcripts or work records. Alternatively, the testimony of knowledgeable persons, such
as a teacher, employer, or coach, is also considered verification by most researchers.
Asher (1972) and Stricker (1987) have advocated exclusive use of verifiable items,
though others utilize or condone the use of non-verifiable items (e.g., England, 1971;
Glennon et al., 1966) and some advocate interleaving verifiable and non-verifiable items
(Landy & Trumbo, 1980; Mumford & Stokes, 1991). Merely warning respondents that
answers will be verified can also reduce faking (Schrader & Osburn, 1977). Verifiability
should be less necessary with discrete and publicly witnessed items for which "faking
good" would require conscious lying. When developing biodata, obscuring the "right"
answers and using subtle items also should discourage socially desirable responses, even
without the threat of verification.

Controllable  and  Equally  Accessible.  From  the  perspective  that  all  life  events  can 
potentially  shape  and  affect  later  behavior,  there  is  no  reason  to  differentiate  between 
experiences  that  a  person  has  consciously  chosen  to  undertake  and  those  that  were 
components  of  the  person’s  environment.  Accordingly,  the  biodata  instruments  of 
numerous researchers include both controllable and noncontrollable items (e.g.,
Mumford & Stokes, 1991; Richardson, Bellows, & Henry, 1985; Russell et al., 1990).
Stricker (1987), on the other hand, argues that it is unethical to evaluate people based
on noncontrollable items pertaining to parental behavior, geographic background, or
socioeconomic status. He also considers items dealing with skills and experiences not
equally accessible to all applicants, such as tractor-driving ability or playing varsity
football, to be unfair. Similarly, the developers of the Armed Services Applicant Profile
(ASAP) and the Air Force’s Leadership Effectiveness Assessment Profile (LEAP), two
biodata measures for military use, have also attempted to delete all non-controllable
items from their instruments (Trent, Quenette, & Pass, 1989; Watson, 1989).

In practice, however, strict adherence to these restrictions would lead to exclusion
of most life experiences likely to be related to later behavior, as well as many items
typically found on school and job application blanks. This would present an especially
severe constraint when sampling applicant pools without extended job histories, such as
military applicants. Because of this constraint, the LEAP researchers felt compelled to
compensate with "behavioral intention" items (Watson, 1989), which are non-historical
speculations about behaviors. Therefore, for both conceptual and practical reasons, it is
argued that these two attributes need not be adhered to.

Visibly Job Relevant. Virtually all life experiences are potentially "job relevant" if
they contribute to the skill base, self-efficacy, or values of the individual, even if the
prospective job has no activities that are superficially analogous to the previous
experience. Nevertheless, Gandy et al. (1989), citing legal constraints, feel that at least
in the public sector, this type of job relevance may be insufficient. If job relevance needs
to be narrowly defined as showing face-valid job pertinence, then the domain of truly
relevant items would be severely limited. Moreover, paradoxically, items which fit the
narrowest definition of job relevant would be the most transparent and most fakable.

Invasion of Privacy. A final concern, which pertains to all self-report items,
involves invasion of privacy. Many items pertaining to topics such as national origin,
religious or political affiliation, or financial status may fall afoul of Federal, state, or
local privacy protection laws (Arvey, 1983; Gandy et al., 1989; Van Rijn, 1980).
Genuinely intrusive questions, such as those dealing with sexual behavior, bodily
functions, or specific religious and ethnic practices, are also likely to incur resistance and
resentment and thereby encourage willful faking, random responding, or other behavior
aimed at foiling the testers. Unfortunately, the parameters of intrusiveness and invasion
of privacy have yet to be defined clearly in the literature.


Summary 

The core attribute of a biodata item is that it addresses a historical event or
experience. The rationale is that previous events shape the behavioral patterns,
attitudes, and values of the person, and combine with individual temperaments to define
the person's identity. Other attributes, though not defining biodata, may have the
advantage of minimizing social desirability distortion. These include limiting items to
those regarding external events, those requiring only objective, first-hand recollection,
and those pertaining to verifiable events. Items involving discrete, unique events may
also be preferred when appropriate. Exclusive use of controllable and equally accessible
items, as well as items narrowly defined as "job relevant," should not be required unless
legally mandated. While clearly intrusive items are offensive and probably
counterproductive, the definition of invasiveness remains unclear.

Because of concerns about faking associated with subjective items, the items used
in the current research effort were all historical, external, objective, and first-person, and
primarily verifiable, at least in principle. Both controllable and non-controllable items
were used, and "relevance" was of necessity defined broadly. Attempts were made to
avoid invasive or otherwise inflammatory items.


The USMA Research Effort

As mentioned above, OIR researchers sought to determine if temperament
constructs, specifically those in the ABLE, could be measured with biodata items without
loss of validity. To do this, ABLE scales deemed most appropriate for the USMA
candidate pool were selected. Next, biodata items were developed which would be
keyed to the appropriate ABLE scales. Because ABLE's relationship with enlisted
attrition and leadership potential has been demonstrated, linking biodata to the ABLE
could determine (1) whether the same relationships hold for cadets, (2) whether the ABLE
constructs could be adequately measured by objective, verifiable biodata items, and (3)
whether, under these conditions, the ABLE and the biodata would account only for the
same variance in attrition and leadership, or whether each would contribute uniquely to
accounting for variance in these criteria.

Also, as opposed to other attempts at rational biodata development, the current
approach takes advantage of the possibility that the behaviors or events behind each
biodata item may be a result of or an antecedent of several different temperaments. For
example, family birth order may be predictive of both dominance and emotional stability,
while classroom performance may be related to work orientation and energy. The
multidimensionality of objective life events, although problematic for the typical
temperament scale, is an important feature of biodata which should be capitalized upon,
rather than ignored. Conversely, perhaps keying first to ABLE constructs, and then
using the predetermined key without reference to the criterion, would show greater
immunity to shrinkage than that typical of empirically keyed biodata.

In summary, there were a number of important purposes for administering both
ABLE and biodata at West Point. First, the feasibility of using objective and verifiable
biodata items to measure temperament constructs was explored. Second, keying biodata
to ABLE was examined as a quasi-rational approach which would enable the use of an
empirically derived biodata measure without the shrinkage in validities often associated
with criterion-keyed measures. Third, biodata and temperament analogs were compared
in terms of their relationship with attrition and leadership, as well as their vulnerability
to faking. Finally, the incremental contribution of both ABLE and the biodata analogs
over and above that of the Whole Candidate Score currently used at West Point was
examined.


METHOD 


Sample 

The incoming USMA Class of 1994 served as the sample for this research. The
class was made up of 1338 plebes, of whom 1325 participated. Of the 1325, 1164 (88%)
were men and 161 were women. The incoming class represented approximately 10% of
the total applicants, so that the subjects are a select group, with expected restriction of
range on many of the variables.


Questionnaire  Development 

The complete questionnaire was administered to the cadets in July 1990, shortly after
their arrival at West Point. Three measures were included in the questionnaire:

Biodata questionnaire. A 73-item biographical data questionnaire was developed
for this research. A number of the items or item topics appeared in previous biodata
forms (England, 1971; Glennon et al., 1966; Richardson et al., 1985), while others were
developed expressly for this research. Items were included if they addressed behaviors
or events seen as relating to: (1) the criteria of interest, with leadership performance as
the primary criterion, and attrition from USMA as the secondary one; (2) the ABLE
temperaments included in the research, especially Dominance; or (3) aspects of military
adaptability and other constructs not covered on the version of the ABLE being used.
Those falling into the last category included interpersonal style, preference for rugged
pastimes, and quality of familial structure and relationships.

There were a number of constraints involved in item development. First, as
mentioned above, preference was to be given to objective and verifiable behaviors, even
when the conceptual relationship to the constructs was more tenuous. Second, unlike
biodata measures used to predict adult success in work situations, the subjects in this
research did not have directly applicable "work experience" as either soldiers or
commanders, so that the option of fitting items to a detailed job analysis was not
feasible. Third, test administration had to be accomplished within tight time constraints,
thus forcing the abandonment of numerous potentially useful items.

Hundreds of items were reviewed for potential inclusion in the questionnaire,
from which an initial pool of 124 items was developed. Subsequently, 30 items which
were perceived as intrusive or likely to generate hostility from the respondents were
dropped, which had the effect of minimizing coverage of some temperaments, notably
Emotional Stability. The remaining 94-item questionnaire was then shortened to 66
items because of time constraints during a subsequent cross-validation administration.

An  additional  seven  items  came  from  a  97-item  extracurricular  activity  and  sports 
participation  checklist  used  previously  in  Air  Force  research.  The  checklist  asked  about 
leadership  roles  in  22  different  high  school  extracurricular  organizations  or  activities,  and 
participation  and  leadership  in  25  varsity  sports.  Because  of  low  variances  on  a  number 
of  activities  and  sports,  as  well  as  cross-validation  time  constraints,  the  activities  items 
and  18  of  the  sports  in  the  extracurricular  activities  section  were  dropped  from  the  cross- 
validation  measure  and  from  further  analysis.  For  each  of  the  seven  remaining  sports 
items,  questions  about  sport  participation,  having  lettered  in  the  sport,  and  team 
captaincy  were  combined  into  a  single  item.  Thus,  the  final  73-item  biodata  measure 
was  made  up  of  66  of  the  potentially  best  items  from  the  94-item  version,  as  well  as  7 
sports  items  from  the  97-item  activity  and  sports  inventory. 

ABLE. An 88-item version of ABLE was assembled for this research. The
measure included the following scales: a 21-item Emotional Stability scale; a 10-item
Dependability scale, here composed primarily of items dealing with endorsement of
traditional values, as opposed to other forms of ABLE, which also include
nondelinquency items in the Dependability construct; a 14-item Work Orientation scale;
a 12-item Dominance scale; and an 18-item Energy scale. An 11-item Validity scale,
designed to detect persons whose responses are consistently contaminated with socially
desirable and/or dishonest responses (Hough et al., 1990), was also included.

In addition, the selection measures currently used at USMA were included in the
research for the purpose of determining the incremental contribution of ABLE and the
biodata. The primary measure is the weighted composite called the Whole Candidate
Score (WCS). However, other WCS components were also evaluated individually against
the criteria, in order to isolate the determinants of success on each criterion. These were
scores on the SAT (V+M combined), high school rank, the Leadership Potential Score
(LPS), and the Physical Aptitude Examination (PAE), all of which were described
earlier.


Keying  Procedures  and  Strategies 

In keying the biodata, a balance was struck between the rational and empirical
approaches. While rational, a priori keying assumes that relationships between item and
criteria should be intuitively obvious, an empirical strategy allows for less obvious and
more complex relationships to be uncovered. However, an overly empirical approach
could lead to the coding of items in illogical ways that are unlikely to be replicated in
future samples. For this reason, a number of experienced practitioners commonly use
some judgment in empirical keying. Based on consultations with some practitioners,
including Mumford (personal communication), the following strategies for logically
tempering "dustbowl empiricism" with a more theoretical "rainforest empiricism" (Mael,
in press) emerged.

One issue concerns the correct keying of non-continuous items, such as "Which of
these courses did you enjoy most?". The experts advised treating each response
alternative as a separate item, so that those choosing "Math" were contrasted with all
others, as were those choosing "English", "Science", etc. The reasoning is that the exact
configuration of the five choices may be too idiosyncratic to be replicated consistently,
thus leading to increased shrinkage upon cross-validation. When two or more
alternatives form a logical subset, they would of course be combined, and contrasted as a
unit to the other options. Thus, a non-continuous item with five alternatives could
actually be used as up to five separate items (Hogan & Stokes, 1989).
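
The one-against-all-others contrast described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `one_vs_rest`, the `groups` argument, and the sample answers are hypothetical and do not come from the original analysis.

```python
def one_vs_rest(responses, groups=None):
    """Turn one non-continuous item into several 0/1 items.

    responses: the alternative chosen by each respondent.
    groups:    optional dict mapping a label to the set of alternatives
               that form a logical subset and are contrasted as a unit.
               (Hypothetical argument; by default every alternative is
               contrasted on its own.)
    """
    if groups is None:
        groups = {alt: {alt} for alt in sorted(set(responses))}
    # 1 = respondent chose an alternative in the group, 0 = all others.
    return {label: [1 if r in members else 0 for r in responses]
            for label, members in groups.items()}


# Hypothetical answers to "Which of these courses did you enjoy most?"
answers = ["Math", "English", "Science", "Math", "History"]
coded = one_vs_rest(answers)
stem = one_vs_rest(answers, {"Math or Science": {"Math", "Science"}})
```

Each resulting 0/1 vector can then be evaluated against a scale or criterion as if it were a separate item.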

Another common problem regards items that contain alternatives chosen by few
people. For example, in the question "How much sleep do you need per night?", if only
3% of subjects respond "5 hours or less" to the question, the mean associated with that
response would likely be unreliable. Therefore, for the present research, alternatives
chosen by fewer than 10% of the sample were considered low-frequency alternatives, and
treated in one of two ways. If the item was continuous, as in the example above, the
low-frequency response was combined with an adjacent response. In this example, the
"5 hours or less" response group would be merged with the "6-7 hours" response group to
form one category. With a non-continuous item, low-frequency responses were coded "1"
and set at the mean, or, in the case of dichotomous coding, set to the same value as the
rest of the "other" category. In both these cases, these adjustments would minimize the
correlations with the criterion, but would be expected to provide more conservative and
stable indications of underlying relationships. Items having overall poor variance (i.e.,
lacking at least two response choices each endorsed by 10% of the respondents)
inevitably did not correlate with any criteria, and therefore had to be dropped
completely.
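
The merging of low-frequency alternatives on a continuous item might look like the following sketch. The 10% threshold is from the text. The response counts are invented, and the rule for choosing which neighbor to merge with (the next higher category, or the previous one for the last category) is an assumption made here for illustration.

```python
def merge_low_frequency(counts, threshold=0.10):
    """Merge ordered response categories endorsed by too few respondents.

    counts: list of (label, n) pairs in response order.
    Returns a list of (labels, n) pairs, where labels is the list of
    original alternatives pooled into that category.
    """
    total = sum(n for _, n in counts)
    merged = [([label], n) for label, n in counts]
    while len(merged) > 1:
        low = [i for i, (_, n) in enumerate(merged) if n / total < threshold]
        if not low:
            break
        i = low[0]
        # Assumed rule: pool with the next category, or with the previous
        # one when the low-frequency category is the last in the list.
        j = i + 1 if i + 1 < len(merged) else i - 1
        a, b = sorted((i, j))
        merged[a:b + 1] = [(merged[a][0] + merged[b][0],
                            merged[a][1] + merged[b][1])]
    return merged


# Hypothetical counts for "How much sleep do you need per night?"
sleep_counts = [("5 hours or less", 40), ("6-7 hours", 600),
                ("8 hours", 545), ("9 hours or more", 140)]
result = merge_low_frequency(sleep_counts)
```

With these invented counts, only the first alternative falls below 10%, so it is pooled with "6-7 hours" and the other two categories are left alone.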

Another issue involves possible illogical keying of items based on strict
empiricism. For example, in the item "How many years did you play varsity chess in high
school?", suppose that the criterion means for responses on this sample were 2.8 ("not at
all"), 3.1 ("1 year"), 3.4 ("2 years"), 3.0 ("3 years"), and 3.7 ("4 years"). Using a strict
empirical key, one would have to assign a lower value to 3-year participation than 1 or 2
year participation. However, barring a compelling post-hoc theory, one would probably
assume a sample-specific quirk, especially if the sample was only moderate-sized. Rather
than code it this way and incur significant shrinkage, a more logical approach would be
to fit this response within the continuum and accept a smaller derivation sample
correlation in return for a more stable estimate of true population values.
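
The text does not say how a quirky response should be fitted within the continuum. One possible way, offered here only as an assumption and not as the procedure actually used, is the pool-adjacent-violators algorithm, which replaces any out-of-order means with their weighted average. The sketch below uses equal weights because the example gives no response frequencies.

```python
def pool_adjacent_violators(means, weights=None):
    """Return non-decreasing fitted values for ordered response means."""
    if weights is None:
        weights = [1.0] * len(means)  # assumed: equal response frequencies
    # Each block holds [fitted value, total weight, responses pooled].
    blocks = []
    for m, w in zip(means, weights):
        blocks.append([m, w, 1])
        # Pool backwards while the newest block is below its predecessor.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            v2, w2, k2 = blocks.pop()
            v1, w1, k1 = blocks.pop()
            blocks.append([(v1 * w1 + v2 * w2) / (w1 + w2),
                           w1 + w2, k1 + k2])
    fitted = []
    for value, _, k in blocks:
        fitted.extend([value] * k)
    return fitted


# Criterion means from the varsity chess example (0 to 4 years).
chess_means = [2.8, 3.1, 3.4, 3.0, 3.7]
smoothed = pool_adjacent_violators(chess_means)
```

Under these assumptions the 2-year and 3-year responses are pooled to a common value between 3.1 and 3.7, so the key rises steadily with years of participation.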


Keying  to  ABLE  Scales 

In the current research, keying to ABLE was empirical, although a good deal of
the logical discretion described above was used in assigning weights. Keying of items to
each ABLE scale involved several steps. First, means on the ABLE scale for each
biodata item response were calculated. Next, a 0, 1, or 2 was assigned to each response
alternative. If the response fell within .05 of the mean, it was considered to be at the
mean and was assigned a value of 1. Responses with means greater than .05 above the
mean were assigned a 2 while responses with means greater than .05 below the mean
were assigned a 0. If no responses were more than .05 away from the mean but two
heavily-endorsed responses were further than .05 from each other, those responses were
coded 0 and 1 or 1 and 2, depending on whether the higher or lower choice was closer to
the mean. Based on advice from other practitioners, options were limited to 0, 1, and 2,
even if a 4 or 5-point continuum was feasible. Examples of keying items in this manner
are presented in Appendix A.
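
The basic 0/1/2 assignment can be sketched as follows. Only the primary rule (a band of .05 around the mean) is shown. The secondary rule for two heavily endorsed responses and the logical discretion mentioned above are not implemented. The function name and the numbers are hypothetical, and "the mean" is assumed to be the overall sample mean on the ABLE scale.

```python
def key_alternatives(alt_means, scale_mean, band=0.05):
    """Assign 0, 1, or 2 to each response alternative.

    alt_means:  dict mapping each alternative to the mean ABLE scale
                score of the respondents who chose it.
    scale_mean: overall mean on that ABLE scale (assumed reference).
    """
    keyed = {}
    for alt, mean in alt_means.items():
        if mean > scale_mean + band:
            keyed[alt] = 2      # clearly above the mean
        elif mean < scale_mean - band:
            keyed[alt] = 0      # clearly below the mean
        else:
            keyed[alt] = 1      # treated as being at the mean
    return keyed


# Hypothetical alternative means on a scale whose overall mean is 2.40.
key = key_alternatives({"A": 2.50, "B": 2.42, "C": 2.30}, scale_mean=2.40)
```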

Once all items were coded in this way, they were correlated with each ABLE
scale. Items with significant correlations of at least .075 with a scale were used to create
each of five ABLE-equivalent biodata scales. This .07-.08 value was indicated by
Mumford (personal communication) as generally being the minimum threshold for
stability upon cross-validation. The five scales were: Bio-Emotional Stability (22 items);
Bio-Dependability (27 items); Bio-Work Orientation (32 items); Bio-Dominance (57
items); and Bio-Energy (40 items). As mentioned above, the item pools for each scale
were not mutually exclusive, and no attempt was made to derive factorially distinct
scales.
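
The item-selection step can be illustrated with a short sketch. The .075 threshold comes from the text. The significance check shown here (a t statistic for r compared with 1.96, a large-sample two-tailed .05 cutoff) is an assumption, since the text does not state which test or alpha level was used, and all names and data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation; returns 0.0 if either variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def t_for_r(r, n):
    """t statistic for testing r against zero with n - 2 df."""
    if abs(r) >= 1:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1 - r * r))


def select_items(keyed_items, scale_scores, min_r=0.075, min_t=1.96):
    """Names of keyed items whose r with the scale meets both cutoffs."""
    selected = []
    for name, codes in keyed_items.items():
        r = pearson_r(codes, scale_scores)
        if r >= min_r and t_for_r(r, len(codes)) >= min_t:
            selected.append(name)
    return selected


# Hypothetical 0/1/2 item codes and ABLE scale scores for 8 respondents.
items = {"a": [0, 1, 2, 2, 1, 0, 2, 1],
         "b": [1, 1, 1, 1, 1, 1, 1, 1],
         "c": [2, 1, 0, 0, 1, 2, 0, 1]}
scale = [1.0, 2.0, 3.0, 2.8, 2.1, 1.2, 2.9, 1.9]
chosen = select_items(items, scale)
```

With a sample of about 1,325, a correlation of .075 corresponds to a t statistic of roughly 2.7, so under this assumed test the .075 cutoff is the stricter of the two conditions.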


Finally, a biodata composite for the whole ABLE was created. To do this, the
best codings of each item, regardless of which ABLE-keyed scale they had come from,
were utilized to form a composite, representing the best of the five temperament-keyed
scales. The resultant 75-item scale was called Bioabsum.

Criterion measures. Three criterion measures were used for this research. The
first was attrition from the initial six-week basic training period known colloquially as
"Beast Barracks", which takes place before the onset of classes. The second was ratings
of demonstrated leadership capability, which were also collected at the end of the six-
week training period. The third criterion was ratings of demonstrated leadership
capability, which were collected at the end of the first semester of classes in December,
1990.  Although  the  leadership  rating  scales  for  the  six-week  and  fall  periods  were 
identical,  the  moderate  correlation  between  the  two  measures  (;  ■  JS),  as  well  as 
evidence  of  differential  relationships  with  the  predictors,  served  as  compelling  grounds 
not  to  combine  the  ratings  or  treat  them  as  repeated  measures  of  the  same  criterion. 

RESULTS 


ABLE  and  Bioabsum 

Descriptive  statistics  for  the  five  ABLE  scales  used  in  this  research  are  shown  in 
Table  3. 

The  intercorrelations  between  the  ABLE  scales  are  also  shown  in  Table  3.  All 
ABLE  reliabilities  were  in  the  acceptable  range,  and  were  comparable  to  those  in 
previous  ABLE  research. 
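
For readers unfamiliar with the reliability index reported on the diagonal of Table 3, coefficient alpha can be computed as in the sketch below. This is a generic textbook calculation with made-up data, not a reproduction of the ABLE analyses.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Coefficient alpha for a respondents-by-items table of scores.

    item_scores: list of respondents, each a list of k item scores.
    """
    k = len(item_scores[0])
    # Variance of each item across respondents.
    item_vars = [pvariance([row[j] for row in item_scores])
                 for j in range(k)]
    # Variance of the total (summed) scale score.
    total_var = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses of three people to a two-item scale.
alpha = cronbach_alpha([[1, 2], [2, 1], [3, 3]])
```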

The correlations between each of the biodata scales keyed to ABLE scales and
the ABLE scales appear in Table 4. The correlation between the composite biodata
scale Bioabsum and the overall ABLE also appears in Table 4. As can be seen, the
correlations between each ABLE scale and its equivalent biodata scale range between
37 and 33. The only off-diagonal correlations between ABLE scales and the biodata
scales of other ABLE scales that were of similar magnitude were those between
Bio-Dependability and ABLE Work Orientation (r = 30) and between Bio-Energy and
ABLE Dominance (r = 39).

Table 3

Descriptive Statistics and Intercorrelations for ABLE Scales

Variable                 Items   Mean    SD     1      2      3      4      5      6

1. Emotional Stability    21     236     30     34
2. Dependability          10     234     38     .18    .70
3. Work Orientation       14     237     37     .18    .49    34
4. Dominance              12     233     32     36     .17    33     32
5. Energy                 18     234     32     37     38     31     .44    31
6. ABLE Total             75     2.40    J2     .73    35     .69    .64    34     .92

n = 1324. For all correlations, p < .001.
Alpha coefficient appears in diagonal.

Table 4

Intercorrelations Between ABLE-Keyed Biodata Scales and ABLE Scales

Variable                  Items   ES      Dep     WO      Dom     EN      ABLE

Bio-Emotional Stability    22     37      .07*    .17     37      33      37
Bio-Dependability          27     .03#    .42     30      .15     32      33
Bio-Work Orientation       32     .09     34      33      36      37      .40
Bio-Dominance              57     30      .11     37      .49     39      38
Bio-Energy                 40     38      .18     34      39      .44     .47
Bioabsum                   75     .19     35      .44     .41     34      .45

n = 1314-1334; # = n.s.; * = p < .05; for all others, p < .001


It should be noted that the ABLE scales were themselves not orthogonal, with
correlations between scales ranging from .17 to .57. In addition, because the same items
were used on multiple biodata scales, there were large correlations between some of the
biodata scales, with rs ranging between .08 and .85. Thus, some degree of overlap in the
off-diagonal coefficients was inevitable. In spite of this, to a great extent the biodata did
manage to capture the specific ABLE constructs that they were keyed to, and
demonstrated some degree of discrimination in their relationships to the ABLE scales.

Interrelationships Among Predictors


In Table 5, the intercorrelations of the ABLE, Bioabsum, the USMA Whole
Candidate Score (WCS), and each of the component USMA predictors (SAT, high
school rank, LPS, and PAE) for the cadet sample are shown. It appears that high school
rank, perhaps as a correlate of grade-point average, is most strongly related to a cadet's
WCS score. However, it must be stressed that these relationships reflect a great deal of
range restriction on the variables. In other words, there is no doubt that for the total
candidate population, aptitude scores are the greatest determinant of WCS scores, and
that the 10% of candidates accepted at USMA had significantly higher SAT scores than
the 90% who were rejected. Table 5 reflects relationships within the group that was
accepted, and demonstrates that high school rank (and, by inference, GPA) is the
greatest determinant of WCS rankings within this select group of acceptees. By the same
measure, the ABLE and biodata scores of those who were rejected may also have
been lower and more varied than those of cadets, because these measures were also
subject to range restriction.

In this sample, ABLE had a small but significant relationship with all components
of the WCS except for SAT. This may be consistent with previous evidence of little
overlap between ABLE and ASVAB, the military aptitude test, or it may again be a
function of range restriction. Bioabsum showed the largest relationship with LPS, which
in part also captures background experiences. Yet, it also relates more strongly to high
school rank than either the LPS or ABLE, as would be expected from the wider
coverage of topics in the biodata measure. Surprisingly, both Bioabsum and the LPS had
negative correlations with SAT scores.

Table 5

Intercorrelations Between ABLE, ABLE-Keyed Biodata (Bioabsum), and Current USMA Predictors

Variable                1       2       3       4       5       6

1. ABLE Total
2. Bioabsum            .45
3. WCS                 .11     .24
4. SAT                 .01#    -.06*   .21
5. High School Rank    .09     .22     .71     -.03#
6. LPS                 .13     31      .12     -.10    .12
7. PAE                 .12     srr     -.11    -.09    -.11    .09

n = 1314-1334; # = n.s.; * = p < .05; ** = p < .01; for all others, p < .001

Relationships to Six-week Attrition


The  correlations  between  the  ABLE  scales,  the  equivalent  biodata  scales,  the 
USMA  predictors,  and  six-week  attrition  appear  in  Table  6.  It  should  be  noted  that 
because  attrition  was  a  dichotomous  criterion  measure,  and  there  was  less  than  10% 
attrition  in  the  sample,  the  maximum  correlation  possible  was  35,  as  explained  by 
Nunnally  (1978,  p.  146).  Nevertheless,  each  of  the  ABLE  scales  was  related  to  attrition, 
with  the  relationships  for  Emotional  Stability,  Dependability,  and  Energy  Level  being 
highest.  Each  of  the  biodata  scales  was  related  to  attrition  as  well.  None  of  the  ABLE 
scales  had  a  significantly  higher  or  lower  relationship  to  attrition  than  the  equivalent 
biodata  scale. 

Currently,  USMA  does  not  utilize  any  measure  to  anticipate  six-week  attrition. 
Thus,  strictly  speaking,  the  WCS  was  not  developed  as  an  indicator  of  attrition  from 
cadet  basic  training.  However,  as  it  is  the  current  USMA  basis  for  selection,  it  is  of 
interest  to  compare  the  performance  of  the  WCS  and  its  components  to  that  of  the 
ABLE  and  biodata.  In  contrast  to  the  ABLE  and  biodata,  the  WCS  was  not  related  to 
attrition  for  this  sample.  This  may  be  explained  by  the  dominant  influence  of  high 
school  rank,  which  had  a  negative,  non-significant  relationship  to  attrition,  on  WCS 
variance  in  this  sample.  The  SAT,  LPS,  and  PAE  had  small,  significant  relationships 
with  attrition,  though  they  did  not  equal  those  of  ABLE  total  or  Bioabsum,  nor  those  of 
the  most  powerful  ABLE  and  Biodata  scales. 

Incremental Validity of ABLE and Biodata. A series of multiple regressions was
performed to determine the incremental contributions of the ABLE and biodata scales
to accounting for attrition over and above the WCS. Both the ABLE and Bioabsum
were found to have incremental validity when entered separately. By contrast, the WCS
did not account for significant variance when entered with either the ABLE or
Bioabsum. When all three were entered together, only the ABLE was individually
significant, thus demonstrating considerable overlap between ABLE and biodata when
keyed in this fashion.
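
The idea of incremental validity can be made concrete with the standard two-predictor formula for the squared multiple correlation. The sketch below is an illustration only: the function names are hypothetical, and the inputs are round values of roughly the size reported in Tables 5 and 6, not a re-analysis of the cadet data.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """Squared multiple correlation of y with predictors 1 and 2,
    computed from the three zero-order correlations."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


def incremental_r_squared(r_y_base, r_y_new, r_base_new):
    """Gain in R-squared from adding the new predictor to the baseline."""
    return (r_squared_two_predictors(r_y_base, r_y_new, r_base_new)
            - r_y_base ** 2)


# Illustrative inputs: baseline (e.g., WCS) correlates .01 with the
# criterion, the added measure (e.g., ABLE) correlates .14, and the two
# predictors correlate .11 with each other.
gain = incremental_r_squared(r_y_base=0.01, r_y_new=0.14, r_base_new=0.11)
```

When the baseline predictor is nearly unrelated to the criterion, as in this illustration, almost all of the added measure's squared correlation shows up as incremental variance.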

The  contribution  of  individual  ABLE  and  biodata  scales,  over  and  above  the 
WCS,  was  also  evaluated  in  order  to  pinpoint  the  temperaments  that  played  the  biggest 
role  in  successfully  accounting  for  variance  in  the  criterion.  For  six-week  attrition,  each 
ABLE  scale  made  a  significant  contribution  when  entered  separately  with  WCS. 
However,  when  entered  together,  the  Emotional  Stability  and  Dependability  scales  were 
the  only  ones  to  have  significant  beta  weights. 

In order to do the same assessment for the biodata scales, it was necessary to first
merge scales with extremely high intercorrelations in order to avoid multicollinearity. As
noted earlier, the biodata scales shared items and thus inevitably overlapped at times,
even though keyed differentially. Thus, Bio-Dependability and Bio-Work Orientation
(r = .85) were merged, as were Bio-Emotional Stability and Bio-Energy (i bJI). When
entered separately with WCS, each combined biodata scale, as well as Bio-Dominance,
provided incremental value. When entered together, however, only the Bio-Emotional
Stability/Energy combination showed a significant individual contribution. When ABLE
was also entered, however, no biodata analog was significant.


Relationships  to  Six-week  Leadership  Ratings 

The correlations of the ABLE scales, the equivalent biodata scales, and the
USMA predictors with six-week leader ratings appear in Table 6. Each of the ABLE
scales was related to leadership performance, with the relationships for Emotional
Stability, Dominance, and Energy Level being highest. In a departure from the other
criteria, two of the biodata scales, Bio-Dependability and Bio-Work Orientation, did not
have a significant relationship with the criterion. Using the formula for comparing
correlations of two variables with a third variable found in Cohen and Cohen (1983, p.
56-57), it was determined that in these two cases, the correlations for the biodata scales
were significantly lower than those of the equivalent ABLE scales. Each of the other
biodata scales, as well as Bioabsum, was related to the ratings, although Bioabsum's
relationship with the ratings was clearly pulled down by the inclusion of Bio-
Dependability and Bio-Work Orientation items. The result was that Bioabsum's
correlation with the criterion was significantly lower than that of ABLE (t = 3.21,
p < .01).

Once again, the WCS was not related to the criterion. The same was true of SAT
and high school rank. Conversely, the PAE had a significant relationship to the ratings,
while the LPS had a smaller, but still significant, relationship as well.

Incremental Validity of ABLE and Biodata. An identical series of multiple
regressions was performed to determine the incremental contributions of the ABLE and
biodata scales over and above the WCS. Both the ABLE and Bioabsum provided
incremental validity when entered separately. By contrast, the WCS did not account for
significant variance when entered with either the ABLE or Bioabsum. When all three
were entered together, only the ABLE had a significant value.

Each ABLE scale made a significant contribution when entered separately with
WCS. However, when entered together, the Emotional Stability scale was the only one
to have a significant beta weight.

Table 6

Correlations of ABLE Scales, Biodata Keyed to ABLE, and USMA Predictors with Attrition and Leadership
Criteria

                           Six-Week     Six-Week     Fall
Variable                   Attrition    Leadership   Leadership

Emotional Stability        .12          .16          .02#
Bio-Emotional Stability    20           .17          .05*
Dependability              22           .10          .14
Bio-Dependability          .08**        -.01#        .13
Work Orientation           .08**        .10          .14
Bio-Work Orientation       .08**        -.01#        .12
Dominance                  .06*         .12          .07**
Bio-Dominance              .09          .07**        .0>'
Energy                     .11          .18          JOb*
Bio-Energy                 .10          .15          .08**
ABLE Total                 .14          .19          .11
Bioabsum                   .11          .07*         .10
WCS                        .01#         .04#         20
SAT                        .07**        .01#         .02#
High School Rank           -.04#        -.01#        .17
LPS                        h8**         .C6*         sn*
PAE                        .06*         .17          .02#

n = 1314-1334 (attrition); 1185-1191 (leadership); # = n.s.; * = p < .05; ** = p < .01;
for all others, p < .001


When  entered  separately  with  WCS,  the  combined  Bio-Emotional 
Stability/Energy  scale  and  the  Bio-Dominance  scale  each  provided  incremental  value. 
When  entered  together,  only  the  Bio-Emotional  Stability/Energy  combination  showed  a 
significant individual contribution. This remained true even when ABLE was
also entered. Surprisingly, the biodata Dominance scale had a significant negative
relationship to the criterion when entered with WCS, ABLE, and the Bio-Emotional
Stability/Energy scale, suggesting a possible role as a suppressor.
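
A sign reversal of this kind can be illustrated with standardized regression weights computed directly from correlations. The sketch below uses two hypothetical predictors with assumed correlations (they are not taken from the regression output of this research): the second predictor has a small positive zero-order validity, yet its beta weight turns negative once the first, overlapping predictor is in the equation.

```python
import numpy as np

# Hypothetical correlations, chosen only to illustrate the pattern.
r_predictors = np.array([[1.0, 0.5],
                         [0.5, 1.0]])   # predictor 1 with predictor 2
r_criterion = np.array([0.19, 0.07])    # each predictor with the criterion

# Standardized betas solve R_xx @ beta = r_xy.
betas = np.linalg.solve(r_predictors, r_criterion)
print(betas)
```

Here the second weight is (.07 - .5 x .19) / (1 - .25), which is negative even though the zero-order correlation of .07 is positive.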


Relationships  to  Fall  Semester  Leadership  Ratings 

The correlations of the ABLE scales, the equivalent biodata scales, and the
USMA predictors with fall semester leader ratings appear in Table 6. Each of the
ABLE scales, with the exception of Emotional Stability, was related to fall ratings, with
the relationships for Dependability and Work Orientation being highest. The
relationships with Dominance and Energy Level were lower than they were with the
other criteria. By contrast, each of the biodata scales, including Bio-Emotional Stability,
was related to these leadership ratings. In each case, the relationship to the ratings was
comparable to that of the equivalent ABLE scale, with no statistically significant
differences. Overall ABLE and Bioabsum also showed their closest proximity to each
other with this criterion.

In contrast to the previous criteria, WCS had the strongest relationship of any
predictor to the fall rating. The correlation between high school rank and the fall
ratings was also higher than that of any ABLE or biodata scale. The LPS had a small
but significant relationship with fall ratings, while the SAT and PAE did not.

Incremental Validity of ABLE and Biodata. The same series of multiple
regressions was performed with this criterion. Both the ABLE and Bioabsum were
found to add incremental validity when entered separately with WCS. As opposed to the
previous criteria, when Bioabsum and ABLE were entered together without WCS, the
biodata scale added significantly to the coefficient. Once again, though, when entered
with both ABLE and WCS, Bioabsum did not add significant variance. In this instance,
WCS also accounted for significant variance when entered with either ABLE or
Bioabsum, or with both together. Apparently the redundancy of Bioabsum derives from
partial overlap with both ABLE and WCS, rather than from extensive overlap with
ABLE.

Three ABLE scales (Dependability, Work Orientation, and Dominance) made
significant contributions to WCS when entered separately. However, when entered
together, only Dependability had a significant beta weight. When the biodata scales
were entered separately with WCS, Bio-Emotional Stability/Energy alone provided
incremental value. Entered simultaneously, however, none provided a significant
contribution, demonstrating the possibility of still more multicollinearity.


Social Desirability Analyses

Table 7 shows the correlation between the ABLE validity (social desirability
detection) scale and each of the ABLE scales, as well as the overall ABLE. The same
correlations are shown for the biodata scales keyed to each ABLE scale and to the overall
ABLE composite (Bioabsum). In each case, the correlation with social desirability for
the ABLE scale was significantly higher than the correlation for the equivalent biodata
scale (Emotional Stability, t = 2.30, p < .05; Dependability, t = 2.34, p < .05;
Work Orientation, t = 6.12, p < .01; Energy, t = 3.31, p < .01; overall ABLE
versus Bioabsum, t = 6.12, p < .01). The sole exception was Dominance, for which
both the ABLE and biodata scales had small relationships to the social desirability scale.
Table 7

Correlations of ABLE Scales, Biodata Keyed to ABLE, and USMA Predictors with
ABLE Validity (Social Desirability) Scale

Variable                    SD Scale     Variable             SD Scale

Emotional Stability           .16        WCS                    .04#
Bio-Emotional Stability       .09        SAT                   -.03#
Dependability                 .31        High School Rank       .09
Bio-Dependability             .35        LPS                    .06
Work Orientation              .39        PAE                   -.00
Bio-Work Orientation          .34
Dominance                     .09
Bio-Dominance                 .07
Energy                        .23
Bio-Energy                    .13
ABLE Total                    .34
Bioabsum                      .18

n = 1314-1334; # = n.s.; * = p < .05; ** = p < .01; for all others, p < .001


Social desirability was not related to performance on any of the criteria in the
research. It was unrelated to six-week attrition (r = .04), summer leader ratings (r =
.04), and fall leader ratings (r = -.01). Some previous studies have shown socially
desirable responding or faking to be criterion-related. When positively related, it has
been interpreted as demonstrating self-esteem (Hogan & Stokes, 1989; Zerbe & Paulhus,
1987), and when negatively related, it has been interpreted as measuring defensiveness
and approval-seeking (Crosby, 1990; Crowne & Marlowe, 1960). In either case, it has
been treated as meaningful variance by these researchers, rather than measurement
error. Clearly, in this sample, socially desirable responding did not account for
significant variance in any of the criteria. Furthermore, partialling the effects of those
scores from the ABLE and biodata predictor-criterion relationships did not affect those
relationships in any way.
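
The effect of partialling can be checked with the first-order partial correlation formula. The sketch below is an illustration rather than the authors' analysis: it plugs in the ABLE Total correlation with six-week leadership (.19) as printed in Table 6, the ABLE Total correlation with the social desirability scale (.34) as printed in Table 7, and the social desirability correlation with summer leader ratings (.04) from the text. The function name `partial_corr` is hypothetical.

```python
import math

def partial_corr(r_xy, r_xz, r_yz):
    """Correlation of X and Y with Z partialled out of both."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))

# X = ABLE Total, Y = six-week leadership rating, Z = social desirability
r_partial = partial_corr(r_xy=0.19, r_xz=0.34, r_yz=0.04)
print(round(r_partial, 2))  # 0.19
```

Because social desirability is nearly uncorrelated with the criterion, removing it leaves the predictor-criterion correlation essentially unchanged, consistent with the result reported above.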

Although there was a significant, positive relationship between the social
desirability scale and each of the ABLE and biodata scales, there was no relationship
between the social desirability scale and either WCS, SAT, or the PAE. However, there
was a positive relationship with the LPS. It should be noted that common method
variance with the ABLE and biodata scales that appeared in the same instrument with
the validity scale would exaggerate their relationships. Surprisingly, the relationship with
high school rank was positive and comparable to that of some of the biodata scales, even
though the USMA high school rankings were not derived from self-report sources.

DISCUSSION 

The current research effort was conducted under a number of severe constraints.
First, because incoming cadets rather than applicants were sampled, the range of
variance on all predictor variables was sharply restricted. The attrition criterion measure
was also restricted, limiting the maximum correlation coefficient to .35. Second, although
typical biodata inventories often have 200-300 items, time constraints in the USMA
setting, especially in the cross-validation effort, limited the researchers to 73 items.
Third, the item pool was limited to external, objective, and mainly verifiable items in
order to render the instrument potentially usable for admissions. Finally, the age and
lack of military experience of the cadets made it impossible to measure directly "job-relevant"
previous experiences.
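
The ceiling on a correlation with a dichotomous criterion such as attrition can be sketched as follows. The report does not state how the figure of .35 was obtained, so this is only one plausible reading: it assumes the dichotomy reflects an underlying normally distributed variable, in which case the largest attainable point-biserial correlation depends on the base rate alone. The base rates used below are hypothetical, not figures from the report.

```python
import math
from statistics import NormalDist

def max_point_biserial(base_rate):
    """Upper bound on the point-biserial r when a normal latent variable
    is cut so that a proportion `base_rate` falls in one group."""
    z = NormalDist().inv_cdf(base_rate)      # cut point on the latent scale
    ordinate = NormalDist().pdf(z)           # normal density at the cut
    return ordinate / math.sqrt(base_rate * (1 - base_rate))

for p in (0.50, 0.10, 0.02):                 # hypothetical attrition rates
    print(f"base rate {p:.2f}: max r = {max_point_biserial(p):.2f}")
```

The bound is highest for an even split and shrinks as the split becomes more extreme, which is why a low early attrition rate caps the observable validities.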

In spite of these limitations, the findings of this research were highly encouraging.
Five biodata scales were created to parallel temperament scales from the Army ABLE.
In each case, the biodata scales showed a clear relationship to the equivalent ABLE
scale and almost always a smaller relationship to the other ABLE scales. The biodata
scales were also compared with the ABLE scales in their relationship to each of three
criterion measures. Out of a total of 15 such comparisons, the biodata measures had a
statistically smaller relationship to the criterion in only two cases. In some cases, the
biodata scales actually had a slightly higher relationship to the criterion. These results
demonstrate that it is possible to develop objective biodata measures that will be
substantially analogous to valid temperament measures, even under the aforementioned
constraints.

Furthermore, with each criterion, either overall ABLE or Bioabsum (the biodata
equivalent) added incremental validity over and above the WCS measure currently used
by USMA. For two of the criteria, Bioabsum was redundant with ABLE and did not
account for additional variance, while for the third Bioabsum had a lesser overlap.
Insofar as the biodata were keyed to maximize their relationship with ABLE scales in
this research, this redundancy is desirable. The results do not preclude the possibility
that the biodata, keyed directly to the criterion, would show less overlap with ABLE and
account for more variance in the criteria. Results of empirical keying were not included
in this report because of the need to properly cross-validate empirical keys.

Moreover, another anticipated benefit of using biodata analogs, that of reducing
vulnerability to socially desirable responding, was also realized. Four of the five
individual biodata scales, as well as the overall biodata scale, had a significantly
smaller correlation with the ABLE validity scale than the equivalent ABLE scale. Thus,
while socially desirable responding did not seem to contaminate relationships of the
predictors to the criteria, the use of objective biodata does seem to offer a way to
minimize the faking problem. Once again, the present research does not preclude
the possibility that biodata keyed directly to the criterion would show less vulnerability to
socially desirable responding than either a temperament scale or even the same biodata
keyed to a temperament scale. In fact, initial indications from empirical keying of these
scales suggest that this is so.

The results are also useful in pinpointing which temperament factors are most
important in determining early success at West Point. Cadets who attrited during the
six-week training tended to be especially lower in stress tolerance (emotional stability)
and in the endorsement of traditional values (dependability). Cadets who succeed in
completing the initial training period may tolerate stress, and be willing to accept authority
and regimentation, to a greater degree than peers who choose to leave USMA at that
stage. Among those who did not attrit, cadets rated highest in leadership performance
during the six-week training period were also distinguished most clearly by their greater
emotional stability and stress tolerance. Finally, dependability, again in the sense of
endorsement of traditional values, had the strongest relationship of all temperaments to
leadership behavior in the ratings from the fall semester.

One of the puzzling results of this research was the relatively minor role played by
the temperament construct referred to as dominance. Although it was related to both
leadership criteria, dominance was never the primary predictor of either training or fall
semester leadership ratings. It is possible that because the primary role of the plebe is
to be a good team player, rather than to direct other cadets in the accomplishment of
their duties, the importance of dominance does not become apparent until later in the
cadet's career. It would be important to obtain criterion measures from a cadet's last
two years at USMA before dismissing the role of dominance.

The relatively low relationship between summer and fall leadership ratings
illustrates that summer and fall performance represent two components of the cadet
leader role, rather than repeated measures of the same construct. In support of this
premise, it can be seen that the PAE is among the best predictors of summer leadership
ratings, yet had a non-significant relationship to fall ratings. Conversely, high school
rank is strongly related to fall ratings, yet unrelated to summer ratings. Seemingly,
successful summer performance is associated with excellent physical conditioning and
stress tolerance, which is also manifested in high energy level, while success in the fall
semester relates strongly to previous academic achievement and dependability. However,
the degree to which leadership propensity or ability accounts for variance in each of the
settings, given the aforementioned lack of actual leadership opportunities for new plebes,
is unclear. These discrepant results suggest the need to evaluate the temperament and
biodata measures against leadership scores throughout the four-year USMA experience,
as well as beyond.
In summary, the results of the current research suggest a useful role for biodata as
an indicator of success at USMA. Additional research is needed to substantiate and
elaborate on these findings. First, the biodata need to be keyed empirically to each of
the criteria, and the results compared to the findings of the current effort. In this way, it
can be determined if keying biodata to temperament scales is an optimal or
counterproductive use of biodata. Second, the research needs to be replicated, in order to
evaluate the stability of the biodata-to-ABLE keys as well as to cross-validate the
empirical keys. This replication is currently underway. Finally, additional effort must be
made to relate these temperament and biodata measures to more longitudinal measures
of leadership success. In this way, the full value of integrating temperament and biodata
into USMA admissions can be determined.
APPENDIX


Examples of Empirical Keying Employed in USMA Research


1.  Your ranking in your graduating class was?

    Item Mean: 2.365

    Response                              Response Mean    Coded

    a.  in the top 5%                         2.456          2
    b.  in the top 6-10%                      2.429          2
    c.  in the top 11-25%                     2.390          1
    d.  above the 25%                         2.311          1
    e.  we did not have class rankings        1.744          0

2.  In your junior year of high school, how many hours have you spent in an average
    week participating in sports and exercise?

    Item Mean: 2.532

    Response                              Response Mean    Coded

    a.  5 or less                             1.369          0
    b.  6-10                                  2.012          0
    c.  11-20                                 2.390          1
    d.  21-30                                 2.503          1
    e.  more than 30                          2.679          1

3.  Were you a member of the debate team in high school?

    Item Mean: 2.460

    Response                              Response Mean    Coded

    a.  yes                                   2.620          1
    b.  no                                    2.012
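
Applying keys of this kind to a respondent's answers can be sketched as a simple lookup and sum. The example below is an illustration only: the item names `class_rank` and `sports_hours`, the function `score_respondent`, and the choice to score omitted or unrecognized answers as zero are assumptions, while the coded values themselves are copied from items 1 and 2 above.

```python
# Coded values from appendix items 1 and 2; item names are hypothetical.
KEY = {
    "class_rank":   {"a": 2, "b": 2, "c": 1, "d": 1, "e": 0},
    "sports_hours": {"a": 0, "b": 0, "c": 1, "d": 1, "e": 1},
}

def score_respondent(responses, key=KEY):
    """Sum the keyed values of a respondent's answers.
    Unkeyed items and unrecognized options add nothing (an assumption)."""
    total = 0
    for item, choice in responses.items():
        total += key.get(item, {}).get(choice, 0)
    return total

# A respondent in the top 5% of the class who exercised 11-20 hours a week
print(score_respondent({"class_rank": "a", "sports_hours": "c"}))  # 3
```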


